32 research outputs found

    Big continuous data: dealing with velocity by composing event streams

    The rate at which we produce data is growing steadily, creating ever larger streams of continuously evolving data. Online news, micro-blogs, and search queries are just a few examples of these continuous streams of user activities. The value of these streams lies in their freshness and relatedness to ongoing events. Modern applications consuming these streams need to extract behaviour patterns that can be obtained by aggregating and mining huge event histories, both statically and dynamically. An event is the notification that a happening of interest has occurred. Event streams must be combined or aggregated to produce more meaningful information. By combining and aggregating them, either from multiple producers or from a single one during a given period of time, a limited set of events describing meaningful situations may be notified to consumers. With their volume and continuous production, event streams relate mainly to two of the characteristics attributed to Big Data by the 5V's model: volume and velocity. Techniques such as complex pattern detection, event correlation, event aggregation, event mining, and stream processing have been used for composing events. Nevertheless, to the best of our knowledge, few approaches integrate different composition techniques (online and post-mortem) for dealing with Big Data velocity. This chapter gives an analytical overview of event stream processing and composition approaches: complex event languages, services, and event querying systems on distributed logs. Our analysis underlines the challenges introduced by Big Data velocity and volume and uses them as a reference for identifying the scope and limitations of results stemming from different disciplines: networks, distributed systems, stream databases, event composition services, and data mining on traces.

    Hybrid query plan generation

    http://ceur-ws.org/Vol-911 - Regular Paper. A hybrid query combines a requirement for data produced by data services with a set of QoS preferences w.r.t. the query execution. In this paper we present the problem of hybrid query optimization and, in particular, the generation of a search space of hybrid query plans. We show how the constraints for generating hybrid query plans are modeled and validate these constraints by implementing them in an action language. We present experimental results, shown as graphs, that illustrate the complexity of this generation.

    Big Data Management Challenges, Approaches, Tools and their limitations

    Big Data is the buzzword everyone talks about. Independently of the application domain, today there is a consensus about the V's characterizing Big Data: Volume, Variety, and Velocity. By focusing on data management issues and past experience in the area of database systems, this chapter examines the main challenges involved in the three V's of Big Data. It then reviews the main characteristics of existing solutions for addressing each of the V's (e.g., NoSQL, parallel RDBMS, stream data management systems, and complex event processing systems). Finally, it provides a classification of the different functions offered by NewSQL systems and discusses their benefits and limitations for processing Big Data.

    Construction et manipulation de présentations spatio-temporelles multimédias à partir de serveurs d'objets répartis : applications aux données sur le Web

    This thesis proposes an infrastructure (JAGUAR) for specifying managers of spatio-temporal multimedia presentations. These managers are mediators between multimedia applications and distributed, heterogeneous object sources accessible through the Web (via their URL). A manager can define, store, query, and build presentations, which are stored in a multimedia database system. The objects handled by a manager for a set of applications are described by a data schema that associates a default presentation with each type; the schema thus defines how applications see the objects. We defined a spatio-temporal model to handle homogeneously the objects integrated into presentations. The model also describes a composite object by making explicit the spatio-temporal relations among its components. We also propose OQLiST, a language that provides spatial and temporal operators for specifying and querying presentations. Together, the language and the model can be used for specifying, querying, and representing inter- and intra-media descriptions homogeneously. To validate the JAGUAR infrastructure, a presentation manager prototype was implemented using the SMIL and Java Media Framework (JMF) platforms. The manager has been used for specifying and implementing (1) a tourism application and (2) a visualization tool for objects stored in a data warehouse. For the latter, the manager provides a specific schema for the data cube.

    Integración de datos espaciales a partir de fuentes heterogéneas y distribuidas

    This paper discusses the challenges associated with spatial data integration and presents the approach adopted in the SPIDHERS project (SPatial data Integration from Distributed and HEteRogeneous Sources), which is based on efficient data mediation, retrieval, analysis, and extraction techniques. Finally, the paper shows how the technology produced in SPIDHERS can be used to integrate and analyze data, and to make decisions, using data on Popocatépetl, a volcano located in the region of Puebla and Mexico City.

    Indexación multidimensional configurable

    There are many indexing methods for multidimensional data. Their fundamental idea is to build dynamic structures that organize complex objects so that they can be queried quickly and effectively. Although taxonomies exist that define the properties of each indexing method, a non-expert user finds it difficult to decide which method would be appropriate for a particular data set. In this paper we describe the architecture of a framework that offers tools for analyzing and implementing various multidimensional indexing methods and that helps a user determine the most suitable method for a data set. The framework also analyzes certain properties of the data sets and the type of queries that will be run on them.

    Optimizing Data Processing Service Compositions Using SLA's

    This paper proposes an approach for optimally accessing data by coordinating services according to Service Level Agreements (SLAs) for answering queries. We assume that services produce spatio-temporal data through Application Programming Interfaces (APIs). Services produce data periodically and in batches. Assuming that there is no full-fledged DBMS providing data management functions, query evaluation (continuous, recurrent, or batch) is done through reliable service coordinations guided by SLAs. Service coordinations are optimized to reduce economic, energy, and time costs.

    ANDROMEDA: Astronomical Data Mediation for Virtual Observatories

    Research report IMAG-LSR, 21 pages. This paper presents ANDROMEDA, an astronomical data mediation system that enables transparent access to astronomical data servers. Transparent access is achieved by a global view that expresses the requirements of a community of users (e.g., astronomers) and by data integration mechanisms adapted to the characteristics of astronomical data. Instead of providing an ad hoc mediator, ANDROMEDA can be configured to give access to different data servers according to different user requirements (data types, content, data quality, and provenance). ANDROMEDA can also be adapted when new sources are integrated into the community or new requirements are specified.